A program to perform Ward's clustering method on several regionalized variables

Authors

  • Carme Hervada-Sala
  • Eusebi Jarauta-Bragulat
Abstract

Earth science studies generally deal with multivariate, regionalized observations, which may be compositional. It is often of interest to know whether such data should be divided into different subpopulations, a task usually performed by cluster analysis. Traditional clustering methods are not appropriate here because the samples are not independent. In that case, an extension of Ward’s clustering method to spatially dependent samples can be used. This methodology is based on a generalized Mahalanobis distance, which uses the covariance and cross-covariance (or variogram and cross-variogram) matrices. In its original version, the method was iterative and tedious, because the spatial covariance structure had to be re-estimated at each step. In this work, we stay within the same theoretical framework but improve the methodology by using the Fast Fourier Transform (FFT) to estimate the covariance structure. We thus obtain a generalization of the adapted Ward’s clustering method to several variables.
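As a rough illustration of the two ingredients named in the abstract, the sketch below estimates auto- and cross-covariances of gridded variables with a zero-padded FFT and then runs Ward's method on data whitened by the lag-zero covariance matrix, so that Euclidean distances in the whitened space equal Mahalanobis distances. This is not the authors' program: the function names `fft_cross_covariance` and `ward_mahalanobis`, the assumption of complete data on a regular 2-D grid, and the use of only the lag-zero covariance matrix in the distance are all assumptions made for this example. The paper's generalized distance, which also accounts for covariance between sample locations, is not reproduced here.

```python
import numpy as np
from scipy.cluster.hierarchy import linkage, fcluster


def fft_cross_covariance(z1, z2):
    """Empirical cross-covariance of two gridded variables for all lags.

    Returns an array of shape (2*ny - 1, 2*nx - 1); the zero lag sits at
    index (ny - 1, nx - 1). Assumes a complete regular grid (hypothetical
    setup for this sketch).
    """
    z1 = np.asarray(z1, dtype=float)
    z2 = np.asarray(z2, dtype=float)
    ny, nx = z1.shape
    a = z1 - z1.mean()
    b = z2 - z2.mean()
    # Zero-pad so the circular correlation equals the linear one.
    shape = (2 * ny - 1, 2 * nx - 1)
    fa = np.fft.fft2(a, s=shape)
    fb = np.fft.fft2(b, s=shape)
    # Correlation theorem: sum_x a(x) * b(x + h) <-> conj(Fa) * Fb
    sums = np.fft.ifft2(np.conj(fa) * fb).real
    # Number of sample pairs available at each lag.
    f1 = np.fft.fft2(np.ones((ny, nx)), s=shape)
    counts = np.rint(np.fft.ifft2(np.conj(f1) * f1).real)
    return np.fft.fftshift(sums / counts)


def ward_mahalanobis(fields, n_clusters):
    """Ward clustering of grid nodes under a Mahalanobis distance.

    `fields` is a list of 2-D arrays of equal shape, one per variable.
    Only the lag-zero covariance matrix is used (a simplification).
    """
    p = len(fields)
    ny, nx = fields[0].shape
    # Lag-zero covariance matrix between the p variables.
    c0 = np.array([[fft_cross_covariance(fields[i], fields[j])[ny - 1, nx - 1]
                    for j in range(p)] for i in range(p)])
    x = np.stack([np.ravel(f) for f in fields], axis=1)  # (n_samples, p)
    # Whitening with the Cholesky factor: Euclidean distance between
    # whitened samples equals Mahalanobis distance between the originals.
    chol = np.linalg.cholesky(c0)
    xw = np.linalg.solve(chol, (x - x.mean(axis=0)).T).T
    tree = linkage(xw, method="ward")
    labels = fcluster(tree, t=n_clusters, criterion="maxclust")
    return labels.reshape(ny, nx)
```

Calling `ward_mahalanobis([z1, z2], 2)` on two arrays of the same shape returns a grid of cluster labels with that shape. The full covariance surfaces returned by `fft_cross_covariance` are what a spatially aware distance would draw on; here only their central (zero-lag) values are used.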

Similar resources

Methods for detecting functional classifications in neuroimaging data.

Data-driven statistical methods are useful for examining the spatial organization of human brain function. Cluster analysis is one approach that aims to identify spatial classifications of temporal brain activity profiles. Numerous clustering algorithms are available, and no one method is optimal for all areas of application because an algorithm's performance depends on specific characteristics...

Generalising Ward’s Method for Use with Manhattan Distances

The claim that Ward's linkage algorithm in hierarchical clustering is limited to use with Euclidean distances is investigated. In this paper, Ward's clustering algorithm is generalised to use with l1 norm or Manhattan distances. We argue that the generalisation of Ward's linkage method to incorporate Manhattan distances is theoretically sound and provide an example of where this method outperfo...

Clustering Large Data Sets Described With Discrete Distributions and An Application on TIMSS Data Set

Symbolic Data Analysis is based on special descriptions of data: symbolic objects. Such descriptions preserve more detailed information about the data than the usual representations with mean values. One special kind of symbolic object is a representation with distributions. In the clustering process, this representation enables us to consider variables of all types at the same time. We pres...

PRFM Model Developed for the Separation of Enterprise Customers Based on the Distribution Companies of Various Goods and Services

This study introduces a new model for combining the variables that affect customer classification, based on a distribution system of goods and services. Given the problems that the RFM model has in various distribution systems, a new model for resolving these problems is presented. The core of this model is the older RFM. The new model, proposed as PRFM, consists of...

Ward's Hierarchical Agglomerative Clustering Method: Which Algorithms Implement Ward's Criterion?

The Ward error sum of squares hierarchical clustering method has been very widely used since its first description by Ward in a 1963 publication. It has also been generalized in various ways. Two algorithms are found in the literature and software, both announcing that they implement the Ward clustering method. When applied to the same distance matrix, they produce different results. One algori...

Journal title:
  • Computers & Geosciences

Volume 30

Publication date: 2004